Replicator Graph Clustering

Author

  • Michael Donoser
Abstract

Motivation: Data clustering, which aims to identify a variable number of clusters and unique cluster assignments in a fully unsupervised manner, is an important topic in machine learning and computer vision. Many of the available methods in this field, such as k-means or mean shift, rely on a Euclidean assumption. In this work we overcome the shortcomings of this assumption by considering the underlying similarity manifold using diffusion processes, which makes it possible to handle non-metric data. The core idea of our novel approach is to combine an effective diffusion process, based on iteratively approaching evolutionary stable strategies from game theory, with a provably optimal clustering step that analyzes a specific graph structure, which we call the Replicator Graph. Our clustering method (Replicator Graph Clustering) belongs to the family of pairwise, or proximity-based, clustering approaches, which assume that the input is an N×N affinity matrix A = (a_ij), where each entry a_ij measures the similarity between two data points i and j that are to be clustered. The goal of clustering is to uniquely assign each of the N data points to one of a set of clusters C = (C_1, C_2, ..., C_C), where the number of clusters C is found automatically. Our method consists of three consecutive steps: (a) diffusing affinities by finding personalized evolutionary stable strategies of non-cooperative games, (b) building a mutual k-nearest neighbor graph representing the underlying manifold, and (c) applying a graph-based clustering strategy that identifies the final clusters. The individual steps have low computational complexity, which leads to an efficient clustering method that scales well with an increasing number of data points.
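
The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not specify how the games are personalized, how many iterations are run, or what the provably optimal clustering step is. The sketch assumes that each point starts discrete replicator dynamics from its own normalized row of A, that a small fixed number of iterations is enough, and it substitutes plain connected components for step (c). All function names and the parameters `k` and `n_iter` are hypothetical.

```python
import numpy as np

def replicator_diffusion(A, n_iter=10):
    """Step (a), assumed scheme: one replicator-dynamics run per data point."""
    A = np.asarray(A, dtype=float)
    # Assumption: point i starts from row i of A, normalized onto the simplex.
    X = A / A.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        payoff = X @ A.T                    # row i holds A @ x_i
        X = X * payoff                      # x <- x * (A x), elementwise
        X /= X.sum(axis=1, keepdims=True)   # divide by x^T A x
    return X

def mutual_knn_graph(P, k):
    """Step (b): keep edge (i, j) only if i and j are in each other's k-NN lists."""
    n = P.shape[0]
    S = P.copy()
    np.fill_diagonal(S, -np.inf)            # a point is never its own neighbor
    nn = np.argsort(-S, axis=1)[:, :k]      # k largest diffused affinities per row
    knn = np.zeros((n, n), dtype=bool)
    knn[np.repeat(np.arange(n), k), nn.ravel()] = True
    return knn & knn.T

def connected_components(adj):
    """Step (c), stand-in only: label connected components by depth-first search."""
    n = adj.shape[0]
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(adj[u]):
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels

def cluster_sketch(A, k=2, n_iter=10):
    """Chain the three steps; the number of clusters is not given in advance."""
    P = replicator_diffusion(A, n_iter)
    G = mutual_knn_graph(P, k)
    return connected_components(G)

if __name__ == "__main__":
    # Toy data: two well-separated groups of three points each.
    pts = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / 2.0)
    np.fill_diagonal(A, 0.0)                # assumption: zero self-affinity
    print(cluster_sketch(A, k=2))
```

The sketch takes only an affinity matrix as input, so it never relies on Euclidean coordinates beyond building the toy example. Connected components on the mutual k-nearest neighbor graph is the simplest possible choice for the last step and should be read as a placeholder for the clustering strategy the abstract refers to.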

Similar resources

Spectral Clustering with Epidemic Diffusion

Spectral clustering is widely used to partition graphs into distinct modules or communities. Existing methods for spectral clustering use the eigenvalues and eigenvectors of the graph Laplacian, an operator that is closely associated with random walks on graphs. We propose a spectral partitioning method that exploits the properties of epidemic diffusion. An epidemic is a dynamic process that, u...

Replicator equation on networks with degree regular communities

The replicator equation is one of the fundamental tools to study evolutionary dynamics in well-mixed populations. This paper contributes to the literature on evolutionary graph theory, providing a version of the replicator equation for a family of connected networks with communities, where nodes in the same community have the same degree. This replicator equation is applied to the study of diff...

Finding Community Base on Web Graph Clustering

Search Pointers organize the main part of the application on the Internet. However, because of information management hardware, the high volume of data, and word similarities across different fields, most answers to users' questions aren't correct. Web graph clustering and the placement of clusters in the corresponding answers therefore help users achieve their intended results. Community (web communit...

Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members

Graphs have many applications in real-world problems. When we deal with huge volumes of data, analyzing the data is difficult or sometimes impossible. In big data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for graph clustering, but it gives us no way to select the number of clusters and the number of members ...

Geometry-Aware Neighborhood Search for Learning Local Models for Image Reconstruction

Local learning of sparse image models has proven to be very effective to solve inverse problems in many computer vision applications. To learn such models, the data samples are often clustered using the K-means algorithm with the Euclidean distance as a dissimilarity metric. However, the Euclidean distance may not always be a good dissimilarity measure for comparing data samples lying on a mani...

Publication date: 2013